Combining frame and segment based models for environmental sound classification

Authors

  • Pengfei Hu
  • Wenju Liu
  • Wei Jiang
Abstract

The paper considers the task of recognizing environmental sounds, which plays a critical role in human perception of the auditory context in audiovisual material. A variety of features have been proposed for audio recognition, either frame-based or segmental. Here, we propose a two-stage framework that combines modeling at these two levels. First, Gaussian Mixture Models (GMMs) are built on short-term features and a preclassification is performed. Then, if the GMMs are not certain about the result, the system engages Support Vector Machines (SVMs) to refine the output hypothesis. In this second stage, the features are combined by taking the posterior estimates of the GMMs, along with segmental features, as the SVMs' input features. Experiments on the sound dataset show that the proposed framework improves on the traditional methods.
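
The control flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation. The abstract does not say how GMM certainty is measured, so the mean per-frame log-likelihood, the equal class priors, and the 0.9 threshold are all assumptions, and the names `segment_posteriors`, `two_stage_classify`, and `second_stage` (a stand-in for a trained SVM) are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def segment_posteriors(frame_loglik: Sequence[Sequence[float]]) -> List[float]:
    """Turn per-frame GMM log-likelihoods into class posteriors for a segment.

    frame_loglik[t][c] is log p(frame t | GMM of class c).
    Assumption: equal class priors, and the mean log-likelihood over
    frames is used as the segment-level score.
    """
    n_frames = len(frame_loglik)
    n_classes = len(frame_loglik[0])
    scores = [
        sum(frame[c] for frame in frame_loglik) / n_frames
        for c in range(n_classes)
    ]
    # Numerically stable softmax over the class scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def two_stage_classify(
    frame_loglik: Sequence[Sequence[float]],
    segment_features: Sequence[float],
    second_stage: Callable[[List[float]], int],
    threshold: float = 0.9,  # assumed confidence threshold
) -> Tuple[int, str]:
    """Return (class index, name of the stage that made the decision)."""
    posteriors = segment_posteriors(frame_loglik)
    best = max(range(len(posteriors)), key=lambda c: posteriors[c])

    # Stage 1: accept the GMM preclassification when it is confident.
    if posteriors[best] >= threshold:
        return best, "gmm"

    # Stage 2: GMM posteriors plus segmental features form the input
    # of the second-stage classifier (an SVM in the paper).
    combined = list(posteriors) + list(segment_features)
    return second_stage(combined), "svm"
```

In practice `second_stage` would wrap the prediction function of a trained SVM; any callable that maps the combined feature vector to a class index fits this sketch.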

Similar resources

Automatic classification of normal and abnormal cardiac sounds by combining features based on wavelet transform and cepstral coefficients extracted from PCG signals (Research Article)

Cardiac sounds are produced by the mechanical activities of the heart and provide useful information about the function of the heart valves. Due to the transient and unstable nature of the heart's sound and the limitation of the human hearing system, it is difficult to categorize heart sound signals based on what is heard from a stethoscope. Therefore, providing an automated algorithm for prima...

Parallel and hierarchical speech feature classification using frame and segment-based methods

Phonemes in the English language can be represented using either parallel or hierarchical distinctive speech features. There have been a number of efforts to integrate multiple information sources but none of these efforts addressed the issue of combining multiple sets of articulatory/linguistic features with different organization topologies. In this study, we combine a frame-based parallel sp...

PROTAX-Sound: A probabilistic framework for automated animal sound identification

Autonomous audio recording is stimulating a new field in bioacoustics, with great promise for conducting cost-effective species surveys. One major current challenge is the lack of reliable classifiers capable of multi-species identification. We present PROTAX-Sound, a statistical framework to perform probabilistic classification of animal sounds. PROTAX-Sound is based on a multinomial regressio...

Polarimetric Synthetic Aperture Radar Image Classification using Bag of Visual Words Algorithm

Land cover is defined as the physical material of the surface of the earth, including different vegetation covers, bare soil, water surface, various urban areas, etc. Land cover and its changes are very important and influential on the Earth and life of living organisms, especially human beings. Land cover change monitoring is important for protecting the ecosystem, forests, farmland, open spac...

Spectral-spatial classification of hyperspectral images by combining hierarchical and marker-based Minimum Spanning Forest algorithms

Many studies have demonstrated that spatial information can play an important role in the classification of hyperspectral imagery. This study proposes a modified spectral–spatial classification approach for improving the spectral–spatial classification of hyperspectral images. In the proposed method, ten spatial/texture features, using mean, standard deviation, contrast, homogeneity, corr...

Publication date: 2012